Recoding

Quantitative Methodology (UPF)

Jordi Mas Elias

https://www.jordimas.cat/

Summary

  • Boolean operations
  • Relabel functions
  • As factor
  • As functions

Warm up

R learning curve

Hall of fame

Best plots!

RStudio workflow

Load packages.

library(dplyr)
library(ggplot2)
library(readr)
library(stringr)
library(forcats)
library(readxl)
library(vdemdata)

Functions and operators

Boolean operators

  • AND (&): TRUE if all conditions are met.
  • OR (|): TRUE if any condition is met.
  • NOT (!): TRUE if the condition is not met.
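The three operators can be tried on small vectors. The following is a minimal sketch in base R; the vectors `gdp_pc` and `democracy` and the 10000 threshold are invented for illustration and do not come from any course dataset.

```r
# Hypothetical values, invented for illustration
gdp_pc    <- c(1200, 45000, 8000, 60000)
democracy <- c(TRUE, TRUE, FALSE, FALSE)

# A comparison returns a logical vector, one value per element
rich <- gdp_pc > 10000             # FALSE  TRUE FALSE  TRUE

rich_and_dem <- rich & democracy   # FALSE  TRUE FALSE FALSE (both must hold)
rich_or_dem  <- rich | democracy   #  TRUE  TRUE FALSE  TRUE (at least one holds)
not_rich     <- !rich              #  TRUE FALSE  TRUE FALSE (negation)
```

These logical vectors are what the recoding functions take as conditions.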

Recoding functions

When we recode variables, we lose information.

To            Function
Binary        if_else()
Categorical   case_when(), case_match()
Ordinal       factor()
Others        as.numeric(), as.character(), as.Date(), etc.

Relabel

If_else

  • To a dichotomous / binary / dummy variable.
df |> 
  mutate(new_name = if_else(logic operation, true, false))
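
As an illustration of the template above, here is a small sketch that recodes a numeric variable into a binary one. The data frame `countries`, its columns, and the 10000 cut-off are hypothetical and chosen only for the example.

```r
library(dplyr)

# Hypothetical data, invented for illustration
countries <- tibble(
  country = c("A", "B", "C", "D"),
  gdp_pc  = c(1200, 45000, NA, 60000)
)

# if_else(condition, value if TRUE, value if FALSE)
countries_rec <- countries |>
  mutate(rich = if_else(gdp_pc > 10000, "Rich", "Not rich"))

countries_rec$rich   # "Not rich" "Rich" NA "Rich"
```

A missing value in the condition stays missing in the result. The `missing` argument of `if_else()` can supply a replacement value if that is not wanted.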

Case_when

df |> 
  mutate(new_var = case_when(logic operation ~ "C1",
                             logic operation ~ "C2",
                             logic operation ~ "C3",
                             ...,
                             .default = "CN"))
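
A possible concrete use of `case_when()` is sketched below, assuming dplyr 1.1.0 or later (where the `.default` argument is available and is passed with `=`). The data and the income thresholds are hypothetical.

```r
library(dplyr)

# Hypothetical data, invented for illustration
countries <- tibble(
  country = c("A", "B", "C", "D", "E"),
  gdp_pc  = c(900, 3500, 11000, 52000, NA)
)

# Conditions are checked in order; the first one that is TRUE wins
countries_rec <- countries |>
  mutate(income = case_when(gdp_pc < 1000   ~ "Low",
                            gdp_pc < 10000  ~ "Middle",
                            gdp_pc >= 10000 ~ "High",
                            .default = "Unknown"))

countries_rec$income   # "Low" "Middle" "High" "High" "Unknown"
```

Because the conditions are evaluated top to bottom, 900 is labelled "Low" even though it also satisfies `gdp_pc < 10000`. The row with a missing value matches no condition and falls through to `.default`.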

Case_match

df |> 
  mutate(new_var = case_match(variable,
                              "c1" ~ "C1",
                              "c2" ~ "C2",
                              "c3" ~ "C3",
                              ...,
                              .default = "CN"))
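
Where `case_when()` takes logical conditions, `case_match()` compares one variable against specific values. The sketch below assumes dplyr 1.1.0 or later; the `survey` data frame and its answers are invented for illustration.

```r
library(dplyr)

# Hypothetical survey answers, invented for illustration
survey <- tibble(answer = c("y", "yes", "n", "no", "maybe"))

# Several old values can be mapped to one new value with c()
survey_rec <- survey |>
  mutate(answer_rec = case_match(answer,
                                 c("y", "yes") ~ "Yes",
                                 c("n", "no")  ~ "No",
                                 .default = "Other"))

survey_rec$answer_rec   # "Yes" "Yes" "No" "No" "Other"
```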

Factor

wb |> 
  mutate(new_vector = factor(income_group, 
                             ordered = TRUE,
                             [levels or labels = ...]))
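
To show why the `levels` argument matters, here is a minimal base R sketch. The `income_group` vector is hypothetical and only stands in for a column such as the one in the template above.

```r
# Hypothetical income groups, invented for illustration
income_group <- c("Low", "High", "Middle", "Low", "High")

# Without `levels`, R sorts the categories alphabetically
levels(factor(income_group))          # "High" "Low" "Middle"

# With `levels`, we state the substantive order explicitly
income_ord <- factor(income_group,
                     levels  = c("Low", "Middle", "High"),
                     ordered = TRUE)

levels(income_ord)                    # "Low" "Middle" "High"
below_high <- income_ord < "High"     # TRUE FALSE TRUE TRUE FALSE
```

An ordered factor allows comparisons such as `<`, and plots and tables follow the declared order instead of the alphabetical one.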

As functions

  • as.numeric(vector)
  • as.factor(vector)
  • as.character(vector)
  • as.integer(vector)
  • as.Date(vector)
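
The conversions listed above can be sketched in base R as follows. All values are invented for illustration; the last lines show a common pitfall when converting a factor to numbers.

```r
# Character to numeric: text that is not a number becomes NA (with a warning)
x_chr <- c("10", "2.5", "abc")
x_num <- suppressWarnings(as.numeric(x_chr))   # 10.0 2.5 NA

# Numeric to integer: decimals are truncated, not rounded
x_int <- as.integer(c(1.9, -1.9))              # 1 -1

# Numeric to character
x_txt <- as.character(c(1, 2))                 # "1" "2"

# Character to date (ISO format year-month-day)
x_date <- as.Date("2024-03-15")

# Pitfall: as.numeric() on a factor returns the level codes
f       <- factor(c("10", "20", "10"))
f_codes <- as.numeric(f)                       # 1 2 1
f_vals  <- as.numeric(as.character(f))         # 10 20 10
```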